Demonstrate roundtrip export/import works #2940

Open
wants to merge 7 commits into
base: main

epugh
Contributor

@epugh epugh commented Jan 6, 2025

https://issues.apache.org/jira/browse/SOLR-13689

Description

Trying to understand the best ways of round-tripping data: export from one collection and index into another, using our existing tooling as much as possible.

Solution

Starting with a BATS test to demonstrate that bin/solr export and bin/solr post work with a .json file.
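
For readers who want the shape of that round trip outside of BATS, here is a minimal dry-run sketch. It is not the test from this PR. The function names `run_cmd` and `roundtrip`, the default output path, and the `SOLR_URL` and `DRY_RUN` variables are all hypothetical, and the flag spellings (`--solr-url`, `-c`, `--query`, `--limit`, `--output`, `--format`) are assumptions that have changed between Solr releases, so check `bin/solr export --help` and `bin/solr post --help` first. It also assumes the target collection already exists.

```shell
#!/usr/bin/env bash
# Dry-run sketch: export one collection to JSON, then post it into another.
# All flag names below are assumptions; verify them against your Solr version.

SOLR_URL="${SOLR_URL:-http://localhost:8983}"  # assumed local Solr instance
DRY_RUN="${DRY_RUN:-1}"                        # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run_cmd() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# roundtrip SOURCE TARGET [FILE]
roundtrip() {
  local src="$1" dst="$2"
  local out="${3:-/tmp/${src}-export.json}"

  # --limit -1 is intended to export every document rather than a capped sample.
  run_cmd bin/solr export --solr-url "$SOLR_URL" -c "$src" \
    --query '*:*' --limit -1 --output "$out" --format json

  # Post the exported JSON file into the (already created) target collection.
  run_cmd bin/solr post --solr-url "$SOLR_URL" -c "$dst" "$out"
}
```

Calling `roundtrip techproducts techproducts_copy` prints the two commands it would run; setting `DRY_RUN=0` executes them from the Solr install directory.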

Tests

BATS.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 6, 2025
@epugh epugh marked this pull request as ready for review January 10, 2025 12:51
@epugh epugh requested review from janhoy and gerlowskija January 10, 2025 12:51
@epugh
Contributor Author

epugh commented Jan 10, 2025

I'd love a plus one on this before I merge. For a lot of multi-step processes, like exporting and importing data, I find modeling them as BATS tests makes them easier to understand in context. I know that moves these BATS tests towards being integration or even system-style tests. I suspect that we may need to split the BATS tests into system/integration/unit tests in the future.

Contributor

@dsmiley dsmiley left a comment

Cool. Love the how-to nature of a BATS test here.

@epugh
Contributor Author

epugh commented Jan 13, 2025

> Cool. Love the how-to nature of a BATS test here.

I have a dream someday to re-organize all the Solr documentation along the lines of Diátaxis. I've used this model a few times and quite like it.

@@ -165,6 +165,20 @@ Once the alias is in place and you are satisfied you no longer need the old data
One advantage of this option is that you can switch back to the old collection if you discover problems our testing did not uncover.
Of course this option can require more resources until the old collection can be deleted.

=== Exporting/Importing Data from Solr

Sometimes you don't want to run your full ETL pipeline to reindex into another collection, you just want to take the data in your existing collection, export it, and then import it back.
Contributor

@gerlowskija gerlowskija Jan 14, 2025

[-0] AFAIK bin/solr export uses the /export handler and can only return fields that have docValues enabled.

That's a huge limitation that we should probably mention here! Imagine someone's confusion when they follow these docs and somehow lose all of the text fields they were searching on!
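
One way to spot affected fields before relying on an export is to ask the Schema API which fields report `docValues` as false. The sketch below is a rough illustration only: `fields_without_docvalues` is a hypothetical helper, it assumes the response was requested with `showDefaults=true` so that properties inherited from the field type appear on each field, and its text-splitting parse is deliberately crude (a real JSON tool would be more robust).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a Schema API "fields" response on stdin and
# prints the name of every field whose docValues property is explicitly false.
# Assumption: each field object is flat (no nested objects) and the response
# was fetched with showDefaults=true.
fields_without_docvalues() {
  # 1. strip whitespace so the patterns below match regardless of indentation
  # 2. split on '}' so each field object lands on its own line
  # 3. keep only objects with docValues false
  # 4. extract the value of the "name" property
  tr -d ' \t\n' |
    tr '}' '\n' |
    grep '"docValues":false' |
    sed 's/.*"name":"\([^"]*\)".*/\1/'
}
```

Against a running instance this could be fed with something like `curl -s "http://localhost:8983/solr/techproducts/schema/fields?showDefaults=true" | fields_without_docvalues`, where the host and the `techproducts` collection name are assumptions. Any field it lists would be expected to go missing from a `/export`-based round trip.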


There are a number of third party tools that do this, see https://solr.cool/ for more information. However, if you want to use what ships with Solr then we have some options:

1. Use `bin/solr export` with the JSON output format (`.json`), and the `bin/solr post` tool to post that data back.
Contributor

[Q] How would a user choose between these three options? Or put differently - why would they choose one over the others?

If there's no strong differentiator between the three, is there value in mentioning them all individually?

Labels
documentation Improvements or additions to documentation
3 participants